Mutations in DNMT3A and IDH1/2 are each found in ~20% of patients with acute myeloid leukemia (AML), and 10-15% of AMLs carry mutations in both genes (herein, double mutants), resulting in a unique methylation landscape and upregulation of a signaling signature. In murine models, the presence of both mutations results in greater leukemogenic potential. However, the specific mechanism through which DNA methylation (DNAme) drives gene expression programs in double mutants remains unclear. We hypothesized that the link between DNAme and gene expression would be explained by more than simple proximity, and that the genomic architecture of the affected genes would play a key role.

To test this, we first performed an unbiased correlation analysis of gene expression with DNAme at all CpG sites (mCs) located within the same topologically associating domain (TAD). We identified 406 genes with significant (FDR < 5% and absolute rho > 0.5) expression-methylation correlations with mCs proximal to the respective genes (herein, the E-M gene set). In addition, another 2,088 genes (the L E-M set) were identified with long-range correlations (>2 kb from the gene body) with mCs in the respective TAD (median distance = 451 kb). As a set, the E-M genes significantly overlapped (P < 10^-2) with genes identified as either differentially expressed (DE; n=890) or differentially methylated (DM; n=4,006) between IDH1/2 and DNMT3A mutant AMLs. Notably, a simple overlap analysis of DE and DM genes showed no significant overlap between them, indicating that correlation analysis was better at bridging the epigenome with the transcriptome. DAVID and Gene Set Enrichment Analysis on the genes ranked by correlation strength revealed that signaling, fructose and lipid metabolism pathways are enriched in the E-M gene set (FDR < 5%) but not in the L E-M set. Analysis of transcription factor (TF) binding profiles did not reveal a common set of TFs binding to the mCs proximal to the genes of the identified pathways. Thus, we hypothesized that the E-M genes share other structural characteristics that drive regulation through DNAme, and we therefore focused on their genomic architecture. This analysis revealed that introns of genes in both the E-M and L E-M sets are significantly denser in Mammalian Interspersed Repeats (MIR) than expected by random chance (P < 10^-2). Additionally, E-M genes were significantly sparser in endogenous retroviruses (ERVL) and primate-specific Alu elements.
mCs with significant correlations were also enriched at MIR and depleted from Alu elements (P < 10^-2), thus creating a regulatory network between mCs and genes with MIR sequences as the common denominator. Genome-wide, CpGs within retrotransposons that were differentially methylated among the three AML subtypes were enriched at enhancer regions or coding genes, particularly the E-M genes. Furthermore, the Dnmt3a knock-out (KO) and Idh2 R140Q knock-in mouse models display the same architectural biases at genes correlated with DNAme as the E-M genes identified in the human samples. Next, we sought to put our findings in the context of normal hematopoiesis and found that genes upregulated during normal hematopoietic differentiation are significantly denser in MIR elements and sparser in Alu elements than expected (P < 10^-2). Alignment of the leukemic samples within normal differentiation trajectories revealed that double mutants resembled differentiated cell types more closely, while DNMT3A and IDH1/2 single mutants resembled hematopoietic stem cells. The E-M and L E-M sets significantly overlapped (P < 10^-2) with those genes upregulated during myeloid but not erythroid or lymphoid differentiation, demonstrating that genes regulated by DNAme are at the core of the biology of these AMLs.
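
To make the selection criteria for the expression-methylation correlations concrete (FDR < 5% and absolute rho > 0.5), the following is a minimal Python sketch, not the pipeline used in this work. It assumes that rho is Spearman's rank correlation, that P values come from a permutation test, and that the FDR is controlled with the Benjamini-Hochberg procedure; none of these choices are stated in the abstract. The function names and the `pairs` input layout are hypothetical, and the gene-CpG pairs are assumed to have been restricted to the same TAD beforehand.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (assumed statistic)."""
    return pearson(ranks(x), ranks(y))


def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P value for rho (assumed testing scheme)."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(spearman(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_pairs(pairs, fdr=0.05, min_abs_rho=0.5):
    """Keep (gene, CpG) pairs passing both thresholds.

    `pairs` maps (gene, cpg) to (expression, methylation), two equal-length
    per-sample lists (hypothetical layout).
    """
    keys = list(pairs)
    rhos = [spearman(*pairs[k]) for k in keys]
    qvals = benjamini_hochberg([permutation_p(*pairs[k]) for k in keys])
    return {k: (rho, q) for k, rho, q in zip(keys, rhos, qvals)
            if q < fdr and abs(rho) > min_abs_rho}
```

Applying both thresholds together means that a pair with a small adjusted P value but a weak rho is still excluded, which mirrors the two-part criterion described above.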

In summary, our integrative work sheds light on a novel mechanism in which epigenetic modifications can regulate gene expression through MIR sequences within introns of hematopoiesis-relevant genes. We posit that overlapping CpG dinucleotides may act as recruiters or substrates of DNMT3A and/or TET proteins. This mechanism also appears to be active in normal hematopoiesis and is thus hijacked by leukemic cells. Therefore, our findings identify retrotransposons as a missing link in the understanding of epigenetic regulation of gene expression, reveal a previously uncharacterized role for these elements in leukemogenesis, and point to different cells of origin for each AML subtype.

Disclosures

No relevant conflicts of interest to declare.
